feat: add support for FastSAM model with point, box and text prompts #1120
Conversation
msluszniak
left a comment
Do we want to add some benchmarks for this one?
chmjkb
left a comment
I tested the demo app on iOS and the results were pretty mid, at least for the S version. Not sure if this is the nature of the model, but just saying.
Probably the nature of the model. You can share what results you get and I can do a cross-check.
…pts.ts Co-authored-by: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>
From what I've tested, the S variant is fine for simple segmentation when objects don't overlap, but in more complex scenes artifacts do show up. The X variant, however, worked fine on all images I tried, even ones with quite complex scenes. Did you also observe poor performance with the X variant?
@msluszniak @chmjkb I've:
I will also be adding benchmarks shortly.
msluszniak
left a comment
LGTM from my side. We can add a tip to the documentation that, for images with overlapping entities, fastSam-X should be preferred over the smaller version.
I think the topk should be enough, no need to over-engineer this. I'll review it now and if it looks ok then

Description
Adds support for the FastSAM model, including the postprocessing required for point, box, and text prompts (text prompts use the already existing CLIP export). Also adds an example app to test them.
Since FastSAM uses a YOLO instance segmentation backbone with some clever postprocessing to imitate Facebook's SAM (see https://docs.ultralytics.com/models/fast-sam/#model-architecture), we reuse the existing instance segmentation C++ implementation and add the postprocessing in TS to minimize code duplication.
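For reviewers unfamiliar with the idea, here is a rough sketch of how prompt postprocessing on top of instance segmentation output could look. It is not the code from this PR: the `Instance` and `Box` types and the `selectByPoint` / `selectByBox` helpers are hypothetical names made up for illustration, and the selection rules (smallest mask covering the point, highest box IoU) are one plausible choice rather than what FastSAM or this PR necessarily does.

```typescript
// Hypothetical types for illustration; the PR's actual data structures may differ.
interface Box { x1: number; y1: number; x2: number; y2: number }

interface Instance {
  box: Box;          // bounding box in pixel coordinates
  mask: Uint8Array;  // row-major binary mask, non-zero = foreground
  width: number;     // mask width in pixels
  height: number;    // mask height in pixels
  score: number;     // detection confidence
}

// Intersection over union of two axis-aligned boxes.
function boxIoU(a: Box, b: Box): number {
  const iw = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const ih = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = iw * ih;
  const areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  const union = areaA + areaB - inter;
  return union > 0 ? inter / union : 0;
}

// Number of foreground pixels in an instance mask.
function maskArea(inst: Instance): number {
  let area = 0;
  for (let i = 0; i < inst.mask.length; i++) {
    if (inst.mask[i]) area++;
  }
  return area;
}

// Point prompt (assumed rule): among masks that cover the point,
// return the one with the smallest area, i.e. the most specific object.
function selectByPoint(instances: Instance[], x: number, y: number): Instance | null {
  const px = Math.floor(x);
  const py = Math.floor(y);
  let best: Instance | null = null;
  let bestArea = Infinity;
  for (const inst of instances) {
    if (px < 0 || py < 0 || px >= inst.width || py >= inst.height) continue;
    if (!inst.mask[py * inst.width + px]) continue;
    const area = maskArea(inst);
    if (area < bestArea) {
      best = inst;
      bestArea = area;
    }
  }
  return best;
}

// Box prompt (assumed rule): return the instance whose bounding box
// has the highest IoU with the prompt box, or null if nothing overlaps.
function selectByBox(instances: Instance[], prompt: Box): Instance | null {
  let best: Instance | null = null;
  let bestIoU = 0;
  for (const inst of instances) {
    const iou = boxIoU(inst.box, prompt);
    if (iou > bestIoU) {
      best = inst;
      bestIoU = iou;
    }
  }
  return best;
}
```

The point of keeping this in TS is that the segmentation step stays generic: it only has to return all instances, and each prompt type becomes a small filter over that list.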
Introduces a breaking change?
Type of change
Tested on
Testing instructions
Screenshots
You can use the following image for testing.
https://upload.wikimedia.org/wikipedia/commons/c/cd/Animal_diversity_October_2007.jpg
Related issues
Closes #555
Checklist
Additional notes